Aligning Bilingual Literary Works: a Pilot Study

Authors

  • Qian Yu
  • Aurélien Max
  • François Yvon
Abstract

Electronic versions of literary works abound on the Internet, and the rapid dissemination of electronic readers will make electronic books more and more common. It is often the case that literary works exist in more than one language, suggesting that, if properly aligned, they could be turned into useful resources for many practical applications, such as writing and language learning aids, translation studies, or data-based machine translation. To be of any use, these bilingual works need to be aligned as precisely as possible, a notoriously difficult task. In this paper, we revisit the problem of sentence alignment for literary works and explore the performance of a new multi-pass approach based on a combination of systems. Experiments conducted on excerpts of ten masterpieces of French and English literature show that our approach significantly outperforms two open source tools.

Similar resources

Constrained Hidden Markov Model for Bilingual Keyword Pairs Alignment

Bilingual terminology dictionaries are resources of much practical importance in many applications of bilingual NLP. Because technical terminology can be both very specific and rapidly evolving, it can, however, be difficult to obtain dictionaries with good coverage. Automatically mining such terminology from technical documents is therefore an attractive possibility. With this goal in mind, and f...

Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel

Characters form the focus of various studies of literary works, including social network analysis, archetype induction, and plot comparison. The recent rise in the computational modelling of literary works has produced a proportional rise in the demand for character-annotated literary corpora. However, automatically identifying characters is an open problem and there is low availability of lite...

Canon, Bestseller, and Peripheral Novels: Does the Position of Literary Works in the English Literary Polysystem Influence the Iranian Translators’ Translational Behavior at the Textual Level?

The present study sets out to investigate whether the position of literary works in the English literary polysystem influences the Iranian translators’ translational behavior at the textual level. Given the prominent position of canon and bestseller novels in the English literary polysystem, the study intends to find out whether the translators of canon and bestseller novels are faithful to theirso...

A Critical Analysis of the Economic Discourse in Khaqani Shirvani’s Poems

Literary works are the carriers of many regulations, values, norms, beliefs, structures and existential, cultural, and social aspects of their time. Many of the social and cultural realities of past centuries embedded within the extant literary works of those centuries can be identified and followed through. The critical approach in discourse analysis of literary texts provides a more precise k...

Literary Multilingualism I: General Outlines And Western World

The term literary multilingualism primarily refers to the more or less extended mix of two or more languages in the same text, entailing a cross-cultural or experimental effect. Besides intratextual multilingualism, or mixtilingualism, there is an intertextual multilingualism between heteroglot works of different authors linked to each other in a specific way (like those of the European and Lat...

Publication date: 2012